Hello everyone. Can you guys hear me properly? Nice, perfect. Today I wanted to start this presentation with quite a bold claim: I would say that your web app is taking up too much RAM, and we can fix it.

This comes from something I noticed recently: if you hover over a tab in your Chrome browser, you will see that Chrome has started, for a while now, to tell users the memory usage of your app. If you look at most applications, such as GitHub, even while looking at a pretty big diff, the memory usage is not that bad. I mean, 122 megabytes was a lot in the 2000s, but now it's not that much. But if you look at other websites that are a bit more expensive, such as Airbnb, you can see that if we load a pretty big page the memory usage goes way up: we're talking about half a gig of RAM being used by the browser. And I was wondering: is it our fault? Is it the browser? What's in that memory that is being used?

We can find out how much of that is actually used by the JavaScript virtual machine — by our variables, our functions and our code. The way to do that is to open the DevTools, where there is a special tab called Memory, and for each JavaScript virtual machine that is running you can see how much memory it is taking up right now. In the case of Airbnb it was around 111 megabytes, which is not much on its own, but it starts to be quite a bit, especially when GitHub was around 10 megabytes in comparison. Then you can look at some more extreme examples: here I fully stress-tested Notion by loading a quite big table, and we got to 1.5 gigabytes of RAM used just by JavaScript variables. That was quite wild, because if you think about it, that's a lot for a web page.

And there are even worse examples — or I would say more difficult examples — like the product I'm currently building. I'm building a web-based tool called Flux, which is a tool for designing electronics in your browser, and it is quite complicated because electronics is made up of a lot of different parts. It's built using TypeScript, React, Three.js and react-three-fiber, so we use a bunch of technologies and also a bunch of abstractions to make our life easier, and that had an effect on us. Because we wanted to be able to render very complicated documents, with a lot of different shapes and text, and everything has to run at 60 FPS, you can see how holding a big project can take a lot of RAM, and that's something that backfired a bit.

Why? Well, because originally we focused a lot on performance: we wanted everything to load very quickly, we wanted scrolling to be fast, and we just optimized for speed. We were like: yeah, memory is cheap, let's just use all the memory that we have. So we optimized what the profiler said, not what the memory profiler said. And we did this partly because of an article from a while ago saying that if you're building React apps you should just memoize everything, just cache everything you can, because that is not going to be an issue in most cases.
Little did we know that we were one of those cases. In fact, you can see how, if you load a pretty big document — at least before the work in this talk — the app would take up way too much RAM.

Now, I can already hear someone saying: well, okay, I have 16 or 32 gigs of RAM on my desktop, why do I care about memory usage? We're not in 1999 anymore. Well, there are still a couple of reasons why we really care about this now.

One of the reasons is out-of-memory crashes. If you're not optimizing memory usage, the browser will limit you. In most cases, for example in Chrome, if you go over four gigabytes you will get the "Aw, Snap!" error code 5, which is an out-of-memory error. There is no way to catch it and no way to handle it; the only thing you can do is prevent it from happening in the first place, because at that point the user needs to refresh the page to recover. On iOS it's even worse, because sometimes the limit goes down and no one is really clear about what the limit is. For example, on Safari on iOS the limit can sometimes go as low as 300 megabytes, and this is what you get: the browser loads the page, tries to load it, goes out of memory, refreshes the page, and enters an infinite refresh loop, which your users will report. That's when your product manager comes screaming into your office asking why the application is not loading on their phone — because you're using too much RAM. So yes, clients might have a lot of RAM, but your browser doesn't care; it will not let you use it.

Another thing is garbage collection performance: the more you allocate, the more you will need to deallocate later, and that's something you have to care about because in some cases garbage collection times can really hurt your performance. One extreme case we saw was about a minute of garbage collection, but here is something a bit more realistic: we were debugging an event handler that was supposed to run on mouse move, so something that is completely on the hot path, and a major garbage collection took 0.5 seconds. That means there was a sharp drop in frames per second just because the garbage collector had to kick in. So that's another thing you want to care about if you care about performance — memory is part of your performance optimization strategy.

And another thing is that, as I showed before, Chrome now shows the memory usage of your website to your users. So if your users have 12 tabs open — or if you're insane like me and have 10 browser windows open with a thousand tabs each; yeah, I should start closing them, maybe tomorrow — they will be able to see that it's your website taking up their entire RAM, and they will not be happy with you. Now they will know which one it is.

So we're in this kind of situation — in my case, the Flux situation. How do we solve this? How did we approach this problem at Flux? Well, first of all it's important to figure out what is occupying memory. Once you do that, there are multiple strategies you can use to kill it with fire. Then we also want to make sure we don't make the same mistake again; that is, we can set up some checks in CI, or set up some monitoring — even with remote users — to check that the memory usage stays reasonable. Of course, in today's talk I want to focus mostly on the first point, because that's already a lot to cover.
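As a rough idea of what that kind of remote monitoring can look like, here is a minimal sketch. It assumes Chrome's non-standard performance.memory API (Chrome-only, so feature-detect it), and reportMetric is a hypothetical stand-in for whatever analytics pipeline you use:

```ts
// Minimal sketch: sample the JS heap in Chrome and ship it to analytics.
// performance.memory is a non-standard, Chrome-only API; reportMetric below
// is a hypothetical stand-in for your own reporting client.
type ChromeMemoryInfo = {
  usedJSHeapSize: number;   // bytes currently used by JS objects
  totalJSHeapSize: number;  // bytes currently allocated for the heap
  jsHeapSizeLimit: number;  // maximum heap size the browser allows
};

function sampleJsHeap(): ChromeMemoryInfo | undefined {
  // Cast because the property is not part of the standard Performance typings.
  return (performance as unknown as { memory?: ChromeMemoryInfo }).memory;
}

function reportMetric(name: string, value: number): void {
  // Stand-in: replace with your analytics client (Segment, Amplitude, ...).
  navigator.sendBeacon?.('/metrics', JSON.stringify({ name, value }));
}

// Sample every 30 seconds; the numbers swing a lot depending on when the
// garbage collector last ran, so treat them as rough trends, not exact values.
setInterval(() => {
  const mem = sampleJsHeap();
  if (mem) {
    reportMetric('js_heap_used_mb', mem.usedJSHeapSize / (1024 * 1024));
    reportMetric('js_heap_limit_mb', mem.jsHeapSizeLimit / (1024 * 1024));
  }
}, 30_000);
```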
So before going into the tooling, I wanted to introduce some ideas about memory usage so that we know what we're talking about. I like to make a few distinctions when talking about memory usage — this is a categorization I made up during my own analysis.

The first is that I noticed a pattern of having either static or transient memory usage. Static memory usage is when you have variables that take up a lot of RAM but are long lived: global variables, state that stays there and doesn't really change throughout the run of your application. That's basically what you would find in a heap snapshot, and it's the easy case — for example, the document that loads and takes up a lot of RAM. But you don't necessarily have a situation like that: sometimes you have a transient peak of memory usage, which means that, for example, the user clicks a button, and that button triggers a very quick operation which allocates an array with one million elements. You see it as a peak in the memory usage at that point, and that can sometimes be harder to debug, because you won't find it in a heap snapshot: a heap snapshot just takes an image of what's in your RAM at that moment, and a peak of memory that gets deallocated immediately won't show up in it. So there are different strategies depending on whether we have the first or the second type of memory problem.

Another thing I like to consider is the count and the size of things. You might have a very easy situation, analysis-wise, in which you're allocating one 500-megabyte string or one 500-megabyte array; that's very different from allocating millions of small objects. If you have the first or the second situation you need completely different approaches to analyze it: a giant object will show up immediately in the memory profiler as a single very big entry, while millions of small elements are much harder to analyze, because you need to check what's inside all those tiny, few-byte objects.

Another thing I like to bring up is the difference between shallow and retained size. These are two terms that you will see in the memory profiler, and the reason they exist is that in JavaScript everything is a pointer. If you have an array of strings, it's actually an array of pointers to strings, so the array itself could be very small — on the order of bytes — but the stuff it points to could be giant; it could be pointing to a lot of one-megabyte strings. When we talk about shallow size, we talk about the size of the allocation itself, such as the array, which is small. But that array is causing other memory to stay allocated, because it's referring to those one-megabyte strings; the retained size is instead the total amount of memory that the object or array is forcing to stay around and preventing from being deallocated.

There's one last topic that is also quite complicated, which is allocation types. In JavaScript there are multiple things you can allocate: different object types, code, strings, plain JS arrays, typed arrays, and also closures, and each one of those behaves differently in memory. One interesting consequence is that functions, too, can take up memory if you're not careful enough, because functions need to keep alive all the variables that are around them. Technically, a function is an object as well in JavaScript, and this means that even creating functions in a loop can become a memory problem, because it's the same thing as creating an array of objects.
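To make that concrete, here is a minimal sketch (not from the Flux codebase) of how functions created in a loop retain whatever they capture:

```ts
// Sketch: why functions created in a loop behave like an array of objects.
// Each closure keeps its captured variables alive, so its retained size can
// be far larger than its tiny shallow size.
function makeHandlers(count: number): Array<() => number> {
  const handlers: Array<() => number> = [];
  for (let i = 0; i < count; i++) {
    // A big buffer that only this iteration touches...
    const scratch = new Float64Array(10_000); // ~80 KB
    scratch[0] = i;
    // ...but the closure below references it, so as long as the handler is
    // reachable, the whole buffer stays allocated too.
    handlers.push(() => scratch[0]);
  }
  return handlers;
}

// 1,000 handlers: each function object is tiny (shallow size), but together
// they retain roughly 80 MB of Float64Array data (retained size).
const handlers = makeHandlers(1_000);
console.log(handlers.length);
```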
Sometimes you can just look at the V8 and Chrome documentation and find a lot of interesting things about how memory is used internally, but that's another topic for another talk, I would say.

Instead, I wanted to look into tooling: if we're in a situation where we have a lot of memory usage, what are some tools we can use to start analyzing what is going on and how to solve the problem? The most famous one is the Chrome memory profiler, that Memory tab you probably saw next to Performance in the Chrome DevTools. It's quite powerful because it can work in three different modes; I think the most interesting ones are the heap snapshot and allocation sampling, which work in very different ways for different purposes. With a heap snapshot you take a big snapshot of everything that's in your RAM, everything JavaScript is working with: imagine you created a lot of variables in your code — with this you can save all of them and look at what's inside, which is really cool because you can even see the values. For each allocation you can also see the retainers, that is, why this thing is in memory: who created it and who is holding references to it. That's useful to determine which function caused that thing to stay in memory. Heap snapshots are very useful for checking things like static memory usage, because they capture a point in time. If instead you're more interested in transient memory peaks, as I said before, there is the other mode, allocation sampling, which works by accumulating every allocation that happens. Everything that gets allocated is recorded, but you don't see the deallocations, so you can't really measure how much RAM you're retaining — you can only see who is allocating it.

In our case, looking at a heap snapshot, we had objects — not too many of them, but some of them were taking a lot of RAM, like 89 megabytes, which is a lot — and we had one specific object that was taking a giant amount of memory, around 80 megabytes. By looking at the retainers we were able to immediately figure out which function was allocating and retaining that stuff, and that led to one of the very first optimizations we managed to do. We went into the code, into that function, and realized we were basically creating a bunch of functions — this is React code — and a bunch of string UIDs, and saving all of them in a Map. Apparently that's incredibly inefficient. It's probably not code that looks inefficient at first glance, but if you call it thousands of times it ends up eating 80 megs of RAM. So how did we solve this? We refactored it a bit by using a Set instead of a Map — it's very experiment-driven — and with this we got roughly a 50 percent improvement in memory usage, which was huge. It really made the difference between being able to load some projects at all, versus documents that would just crash your browser.
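The actual Flux code isn't public, so here is a hypothetical sketch of the shape of that refactor — a Map full of per-UID closures replaced by a Set of UIDs plus one shared handler:

```ts
// Hypothetical sketch (not the real Flux code) of the pattern we hit.
type Uid = string;

// Before: one closure per UID, all stored in a Map. Every entry retains its
// own function object plus whatever that closure captures.
function buildHandlers(uids: Uid[], onSelect: (uid: Uid) => void) {
  const handlersByUid = new Map<Uid, () => void>();
  for (const uid of uids) {
    handlersByUid.set(uid, () => onSelect(uid)); // thousands of closures
  }
  return handlersByUid;
}

// After: store only the UIDs in a Set and share a single handler that takes
// the UID as an argument — same behaviour, far fewer allocations.
function buildSelection(uids: Uid[], onSelect: (uid: Uid) => void) {
  const selectableUids = new Set<Uid>(uids);
  const handleSelect = (uid: Uid) => {
    if (selectableUids.has(uid)) onSelect(uid);
  };
  return { selectableUids, handleSelect };
}
```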
That was one of the first big wins we had, so we were like: yeah, okay, let's continue like this, eventually we'll reach zero megabytes of memory used, right? No. Immediately after that we hit pretty much a brick wall: we were taking heap snapshots and seeing that we had two million objects taking up a lot of space. It's not like we had one big object to optimize — each one of them was a couple of bytes — and the heap profiler really doesn't help you in those cases. That's interesting, because it's pretty much the same situation you will find if you try to profile that same Notion page I tested before, or even Airbnb; it's actually the same problem, and unfortunately the answer is that the problem is React, kind of. We're in the same situation with Notion: is it really two gigs of RAM? That can't be right — and it's all occupied by a lot of small objects.

So yeah, we hit a brick wall. What do we do now? The heap profiler is very bad at analyzing this kind of thing. Thankfully, we can export from it: we can export a giant five-gigabyte JSON from Chrome. Then we look at the JSON and see that it's in a format that is pretty much unreadable. But thankfully someone did the work for us: the folks at Meta built a beautiful tool called MemLab, which is a toolkit for exploring memory usage. It's very focused on finding memory leaks — it has an entire automation for that — but I think it's even cooler that it provides a very powerful API for opening snapshots from Chrome and analyzing them. What you can do, basically, is read the objects in memory and perform analytics on them.

For example, we wanted to answer this question: which type of objects is taking up the most space, out of the two million that we found in a snapshot? This is some code that we wrote — I don't think we have time to go too deep into it, but I can publish it. The idea is that we load the snapshot, so the saved state of memory, find all the object types — what the type of each object is — compute the total shallow size for each type, and then sort and print the results.
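I can't paste our exact script here, but a sketch of that analysis looks roughly like this; it assumes MemLab's heap-analysis package exposes getFullHeapFromFile and per-node name, type and self_size fields, so check the current MemLab docs for the exact API:

```ts
// Rough sketch of grouping total shallow size by object type, using MemLab's
// heap-analysis helpers. The names getFullHeapFromFile, node.type, node.name
// and node.self_size are from memory — verify them against the MemLab docs.
import { getFullHeapFromFile } from '@memlab/heap-analysis';

async function groupShallowSizeByType(heapFile: string): Promise<void> {
  // Load the .heapsnapshot exported from Chrome DevTools.
  const heap = await getFullHeapFromFile(heapFile);

  // Accumulate total shallow size per object type / constructor name.
  const bytesByType = new Map<string, number>();
  heap.nodes.forEach(node => {
    const key = node.type === 'object' ? node.name : node.type;
    bytesByType.set(key, (bytesByType.get(key) ?? 0) + node.self_size);
  });

  // Sort descending and print the top offenders.
  const sorted = [...bytesByType.entries()].sort((a, b) => b[1] - a[1]);
  for (const [type, bytes] of sorted.slice(0, 20)) {
    console.log(`${type}: ${(bytes / (1024 * 1024)).toFixed(1)} MB`);
  }
}

groupShallowSizeByType('./Heap.heapsnapshot').catch(console.error);
```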
From this, the results were very interesting: for each object type — even including the keys of the objects — we could see how much memory it was occupying. In the top two we had one object called FiberNode, which comes from React, and another object with fields like baseQueue, baseState, memoizedState, queue, next... what is that? That's not something that came from our application; that's React again — it's the data structure used internally for keeping track of hooks. We went into the React source and there was exactly that data structure, which in most websites that use React heavily nowadays is pretty much the thing occupying the most memory. So we figured out that keeping track of hooks is expensive. But are we supposed to just tear down the 400,000 lines of React that we have in our app right now? That's a bit too far into development. So we wanted to know precisely what to optimize, and we used MemLab again, this time digging even deeper: by looking at this FiberNode data structure that React uses and running a lot of statistics on it, we tried to figure out which React component was taking up the most memory, so that we could optimize that specific component first. And we managed to do it: we were able to divide our memory usage by React component and see how much memory each hook was using. With this we found a specific React component that was using a lot of memory, and we cut the memory usage down again by 60 percent, which was pretty nice. MemLab really saved us here, because it let us make our app work properly.

It also made it possible to answer other questions, like: out of all the strings that we have in our app, how many of those are UIDs? Should we start optimizing UIDs and turn them into numbers? Well, no, because we used MemLab to find all the UIDs and found out they were about two megabytes in total — so who cares. It's also nice to know what not to prematurely optimize.

Just to sum up everything I said: I think we can all agree that memory analysis is actually difficult, especially because it varies so much between applications, frameworks and browsers. But it's important, even in a world like today's where we have a lot of RAM, because for some apps it really makes a difference — the difference between being able to use Notion on your phone or the app constantly crashing and never loading your data. And the Chrome profiler is cool, but sometimes it's not enough; thankfully it can export its data, so at least you can perform your own analysis externally. So thank you for listening to my presentation.

Are there any questions? I see a question here. Yes — you were talking about shallow size versus retained size: when would you ever be interested in looking at the shallow size? The retained size sounds like the more interesting one. Yeah, the question is about when we care about shallow size, given that we also have the retained size. Well, we care a lot about shallow size; in our case it was all about shallow size, and we had to write our own custom plugin for MemLab just to analyze shallow sizes. Why? Because we render very big documents with thousands of lines, and in that case you have to use tricks like virtual scrolling: instead of allocating all the DOM elements, you keep reusing the same ones. If you think about it, that's like ejecting from React, because you're creating something just with JavaScript and the DOM and then creating a React wrapper around it. That's another thing this shows: React is good at orchestrating stuff, but for the performance-critical things inside your application you need to start optimizing differently.

Just a small remark before we continue with the questions: if there are free seats, please try to squeeze in and not leave spaces in the middle. As you can see, we have hundreds of people waiting outside and in here as well, and we cannot have that many people on the sides, so please squeeze in — don't leave free seats for your jackets, put them on your lap. Thank you. And since we're starting to be quite a lot, if you're going to go out, please try to go out from the right side and avoid going out from the left side, so that it's easier for everyone. Thank you.

We have a question here. It's more of a comment than a question: the thing is that this limitation of four gigabytes of memory comes from the fact that Chrome compresses pointers so that small objects take less space, basically; that's one thing. The second thing is that this is also a security mitigation, so that when there is some bug in V8 it's harder to exploit
it. But also, I've read on the Chromium bug tracker that there is, for example, a 16-gigabyte limit for fixed arrays, so there may be different limits for different things — WebAssembly also has a different limit — and also, supposedly, Electron apps don't have limits. Yeah, that's very cool, thank you. I think Firefox has pretty much the same limitations.

Oh, someone asked whether we also tried other browsers. Yeah, I'm actually mostly working in Firefox, and Firefox has very similar limitations; sometimes it's even worse, because we noticed that the app sometimes randomly takes more memory in Firefox for some reason. Some things are more optimized in Firefox, other things are more optimized in Chrome, so that's very complicated to answer, unfortunately. It seems like the answer is either you look deeply into the source code of the browsers — and I still haven't reached that point — or you do trial and error. Ah, and about the tooling: Firefox also has tooling around this, which, if I remember correctly, is more focused on analyzing the memory usage of DOM elements, and it also has some facilities for analyzing heap snapshots. But since MemLab works with Chrome heap snapshots, we went with Chrome immediately.

And how do you go about running this in CI? Oh, yeah, that's a complicated thing, because running it in CI is pure pain. You can use MemLab and run it in CI, because it uses Playwright — or maybe Puppeteer, I don't remember which, I think Puppeteer — and with it you can orchestrate tests that open a page; it can even use some machine-learning algorithm to find memory leaks. The problem with doing that is that it's fine if your app is small; if your app starts to get bigger, you need a CI machine powerful enough to run your app and the profiler on top of it, which for us meant the CI time went up to something like 30 minutes, which was unacceptable. So eventually we removed it, but you can do it.

Are there other questions? Yes, a question there — about reading the memory usage from the browser at runtime, or something like that? Yeah, that's another complicated thing, because if you're using Chrome — I don't think Firefox allows it, but Chrome does — there is a performance.memory property, I think, that you can use: you can check the maximum allowed heap size and also read an estimate of the current memory usage. In our case, once we read that, we constantly send the data to Segment and then analyze it in Amplitude, which lets us keep track of memory usage — we also do that for the performance timings. The problem is that we noticed the data very quickly becomes bogus, because it depends a lot on what the user is doing and on when the garbage collector kicks in: sometimes the heap goes up to four gigabytes and then, no problem, it goes back down to 500 megabytes. So it's extremely difficult to capture memory usage, because you don't have a precise measure of how much of the total retained memory is active and how much is actually inactive and about to be garbage collected. We do try, and we have some charts showing how much memory is being used, but it's very hard to make sense of them, unfortunately. Any other questions? We still have around five minutes for questions.